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STATIONARY DISTRIBUTIONS OF A MODEL OF 
SYMPATRIC SPECIATIGN^ 

By Feng Yu 

University of Oxford 

This paper deals with a model of sympatric speciation, that is, 
speciation in the absence of geographical separation, originally pro- 
posed by U. Dieckmann and M. Doebeli in 1999. We modify their 
original model to obtain a Fleming-Viot type model and study its 
stationary distribution. We show that speciation may occur, that is, 
the stationary distribution puts most of the mass on a configuration 
that does not concentrate on the phenotype with maximum carry- 
ing capacity, if competition between phenotypes is intense enough. 
Conversely, if competition between phenotypes is not intense, then 
speciation will not occur and most of the population will have the 
phenotype with the highest carrying capacity. The length of time it 
takes speciation to occur also has a delicate dependence on the muta- 
tion parameter, and the exact shape of the carrying capacity function 
and the competition kernel. 

1. Introduction. Understanding speciation is one of tiie great problems 
in the field of evolution. According to Mayr [9], speciation means the split- 
ting of a single species into several, that is, the multiplication of species. It is 
believed that many species originated through geographically isolated pop- 
ulations of the same ancestral species. This phenomenon is relatively easy 
to understand. In contrast, sympatric speciation, in which new species arise 
without geographical isolation, is theoretically much more difficult. In this 
work, we take the recent work in Dieckmann and Doebeli [4] on sympatric 
speciation as a basis, and try to develop a model that captures the most 
important aspects of their model and yet is also amenable to rigorous math- 
ematical analysis. In Section 1.1, we briefly describe the Dieckmann-Doebeli 
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model of sympatric speciation. Their original model is very difficult to study, 
so in Section 1.2, we present a simplified model that retains almost exactly 
the fitness function found in the original Dieckmann-Doebeli model, and 
perform some nonrigorous analysis that illustrates the delicate dependence 
of transitory behavior on the exact form of the fitness function. In Section 2, 
we present our main model, a Fleming- Viot model with strong selection and 
a fitness function that retains the key features of the original Dieckmann- 
Doebeli model. The advantage of using a Fleming-Viot model is that one 
can write down the stationary distribution quite explicitly, and stationary or 
long-term behavior is usually easier to study than transitory ones. It turns 
out that the stationary distribution concentrates more and more mass near 
its global maximum as the population size becomes larger, thus identifying 
the global maximum gives a strong indication of the kind of configuration 
eventually taken up by the population. The main results are given toward 
the end of Section 2, along with some discussion of these results. The rest 
of the paper. Section 3, is devoted to proofs of various results on local and 
global maxima of the stationary distribution of the Fleming-Viot model 
introduced in Section 2. 

1.1. The Dieckmann-Doebeli model. Dieckmann and Doebeli [4] pro- 
posed a general model for sympatric speciation, for both asexual and sexual 
populations. We will briefly describe their model for the asexual population, 
since this is the model we study in this work. Their sexual model is natu- 
rally more complicated than the asexual model, but the two models have 
similar behavior. In their asexual model, each individual in the population 
is assumed to have a quantitative character (phenotype) rr S M determining 
how effectively this individual can make use of resources in the surrounding 
environment. A typical example is the beak size of a certain bird species, 
which determines the size of seeds that can be consumed by an individual 
bird. The function i^TiM— >]R^ (carrying capacity) is associated with the 
surrounding environment, where Kx denotes the number of individuals of 
phenotype x that can be supported by the environment. For example, since 
birds with small beak size (say xi) are more adapted to eating small seeds 
than birds with large beak size (say X2, X2 > a^i), K^^ will be larger than 
if the surrounding environment produces more small seeds than large seeds. 
In the Dieckmann-Doebeli model, is taken to be cexp(— (x — x)^/(2fT|j-)). 
Moreover, every pair of individuals compete at an intensity determined by 
the phenotypical distance of these two individuals. More specifically, an in- 
dividual of phenotype xi competes with an individual of phenotype X2 at 
intensity Cxi-X2i where Cx = exp(— x^/(2(T^)). Therefore each individual in 
the population interacts with the environment via the carrying capacity 
and interacts with the population via the competition kernel C. 
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Let Nx{t) denote the number of individuals with phenotype x at time t. 
At any time, an individual of phenotype x gives birth at a constant rate, and 
dies at a rate proportional to {C * N.{t))x/Kx, that is, inversely proportional 
to the x-carrying capacity, but proportional to the intensity of competition 
exerted by the population on phenotype x, the numerator (C * N.{t))x = 
J Cx-yNy{t) dy being how much competition (from every individual in the 
population) individuals with phenotype x suffer. In addition, every time an 
individual gives birth, there is a small probability that a mutation occurs 
and the phenotype of the offspring is different from that of the parent; in 
this case, the phenotypical distance between the offspring and the parent is 
then random and assumed to have a Gaussian distribution. 

Since the number of individuals of a certain phenotype increases via the 
birth mechanism at a linear rate, but decreases via the death mechanism 
at a quadratic rate, extinction of all phenotypes will occur in finite time 
with probability one, that is, = eventually. For large initial populations, 
however, extinction will happen far enough into the future that interesting 
behavior does arise before the population becomes extinct. 



i = I 



t = 100 



( = 200 
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Fig. 1. Simulation of the Dieckmann-Doebeli model with E — [—50, 50] nZ, ax = VTOOO, 
ac ~ \/600 and mutation happening to 1.5% of the births. 
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Monte-Carlo simulations, shown in Figure 1, give a fairly good idea of 
the behavior of the Dieckmann-Doebeli model for asexual populations. If 
the initial population is monomorphic (t = 1 in Figure 1), that is, concen- 
trated near a certain phenotype xq i^^-iO) / J2x ~ (^xo); then the entire 
population first moves (t = 30, 100, 200 in Figure 1) toward x, the pheno- 
type with maximum carrying capacity. If ac > ctk (this includes the case 
ac = oo, i.e., equal competition between all phenotypes), then the popula- 
tion stabilizes near phenotype x. But if ac < ctk, then the monomorphic 
population concentrated at phenotype x splits into two groups, one group 
concentrating on a phenotype < x, while the other concentrating on a phe- 
notype > X {t = 330,370,400,500 in Figure 1). In the latter case, one can 
say that one species has evolved into two distinct species. The variance of 
the Gaussian distribution used in the mutation kernel affects how different 
phenotypically the offspring can be from the parent, and seems to affect the 
speed of evolution, but not the configuration eventually taken up by the 
population. 

1.2. A conditioned Dieckmann-Doebeli model. As noted in the very first 
paragraph, the Dieckmann-Doebeli model for asexual populations is very 
difficult to study. One reason for this difficulty is because the number of 
individuals can fluctuate with time. As mentioned before, since the birth 
rate is linear but the death rate is quadratic, extinction will occur in finite 
time with probability one, which makes it somewhat meaningless to analyze 
the stationary or long-term behavior of the system. The modification we 
apply to the Dieckmann-Doebeli model is to assume constant population 
size A^, reflecting a constant carrying capacity of the overall population, 
and define a Wright-Fisher type model (for a definition of Wright -Fisher 
model and its relationship with Fleming- Viot models, see [6]) with fitness 
functions chosen to retain key ingredients of the original Dieckmann-Doebeli 
model. In contrast to the continuous-time nature of the original Dieckmann- 
Doebeli model, the modified model is discrete time. Because the number of 
individuals remains constant, analyzing the behavior of the population is 
equivalent to analyzing the empirical distribution 



where Xn, n = 1, . . . , N , denotes the phenotype of the nth individual in a 
population of size A^ and 6x is the measure that puts unit mass at phenotype 

X. 

Before we describe our choice of fitness functions, we briefly describe the 
concepts of fitness and selection. Selection occurs when individuals of differ- 
ent genotypes leave different numbers of offspring because their probabilities 
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of surviving to reproductive age are different (see [1]). If we define fitness to 
be a measure of how hkely a particular individual produces offspring that 
will survive to reproductive age, then individuals with higher fitness should 
have higher probability of being selected for reproduction. Along these lines, 
it is natural to define fitness of a phenotype as the difference between the 
birth rate and the death rate of individuals of this phenotype, therefore it 
is also natural to require the fitness function to be bounded. 

The key feature of the Dieckmann-Doebeli model is that each individual 
has a fitness that depends on both the carrying capacity associated with its 
phenotype and the configuration of the entire population. More specifically, 
the fitness of a phenotype x is an increasing function of Kx, the carrying 
capacity, but a decreasing function of (C * N)x, the competition it suffers. 
Here Nx is the number of individuals of phenotype x. With this in mind, we 
propose the following two fitness functions: 



Kx 



J2z Cx^zT^z 



Each of the two fitness function defined above is an increasing function of 
Kx and a decreasing function of (C * 7t)x. W^^^ resembles more closely the 
original Dieckmann-Doebeli model, but it has the disadvantage of being in 
a more complicated form than W^'^^ and it is also not differentiable. Our 
simplified discrete-time and discrete-space Dieckmann-Doebeli model is as 
follows: 

• At every time step t S Z^, the entire population is replaced by a new 
population of individuals, each chosen independently according to the 
distribution p.ltjir^): 

Pxit,. )-^A{y,x)^^^^^^^^^^^^^^^^ 

where the denominator X^z'^^^ is simply the normalization 
factor such that ^xPx{t,T^^) = 1 and 

\. E = [— -L, L] n Z is the phenotype space, and vr^ G V{E) is a probability 
measure on E, 

2. K -.E ^ [0, 1] is the carrying capacity, and C : Z ^ M"'" is the competition 
kernel, 

3. Wxi'K) is the fitness of phenotype x in a population with empirical dis- 
tribution TT (sometimes we notationally suppress the dependence on tt), 
and W = ^y(i) oxW = W^'^\ 
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4. A is a Markov transition matrix associated with mutation, with A{y,x) 
denoting the probabihty of an individual of phenotype y mutating to an 
individual of phenotype x. 

By Theorem 1 in [3], {vrf ,t G [0,r]} {vr*,* G [0,r]} as ^ oo, where 
^ denotes weak convergence and vrt evolves according to the following de- 
terministic dynamical system: 

7ry{t)Wy{n{t)) 



(1) Tr,{t+l)=J2My,x) 



E,vr,(t)W^,(7r(i))- 



Analyzing the dynamical system (1) is still not easy, partly because it is of a 
complicated form that is nonlinear in vr, and we cannot find any Lyapunov 
function [8] that associates with (1). Simulations of (1), however, seem to 
display some interesting behavior, which we will describe after carrying out 
some nonrigorous analysis of (1). 

Without mutation, any phenotype x with vr^ = at any time r will stay 

for all t>T. Mutation enables individuals of phenotype x to be born in 
future generations even if there are no individuals of phenotype x in the 
present generation. But if we start with a polymorphic initial measure, that 
is, TTxiO) 7^ for all x, then adding small mutation to the system should not 
cause significant changes in the behavior of (1). Therefore we assume that 
A = I and 7r(0) is polymorphic. In this case, (1) can be simplified to 

' ~ E.^.m.i^it))- 

Thus if ^ = /, then vr is a stationary distribution of (1) if and only if 

(2) TTx = -Ti-xWx{TT) 

c 

for some constant c. Condition (2) is equivalent to 

(3) Wxi'Ti') = c for all x where Tf{x) ^ 0. 

Let K and C be in the form considered by Dieckmann and Doebeli, that 
is, Kx = ex.p{-x^ /2aj^) and Cx = exp(-xV2o-^). If W = W^^\ then condi- 
tion (3) means that 

Kx = c{C * vr)(a;) for all x where t:{x) ^ 0, 

which seems to indicate that if (Tq < ctk, then vr should be close to M{0, aj^ — 
(Tq)- On the other hand, if 1/F = VF^^\ then vr is a stationary distribution if 

1 — Q2^Cx-z'^z)/ Kx is a strictly positive constant. Notice that if K and C 
are both Gaussian-shaped with Kq = Cq = 1 then tt = J\f{0, aj^ — a'^) makes 
1 — i^^Cx-z^z) / Kx constant; furthermore, this constant is strictly positive 
since (C * 7r)(0) < Kq = 1 if fjc < ax- 
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Fig. 2. Simulation of (1) with E = [-149, 149] n Z, ctx = 60, ac = 55 and W = VK'^\ 



Therefore for both W^^^ and assuming Gaussian competition and 

carrying capacity kernels, the dynamical system (1) should have Gaussian- 
shaped stationary distributions if ac < ctk- In simulations carried out by 
Dieckmann and Doebeli [4], however, ac < ax is the case that leads to spe- 
ciation, that is, the stationary distribution supposedly has two sharp well- 
separated peaks, which contradicts the analysis carried out in the previous 
paragraph. Simulations of (1) with W = W^^\ shown in Figure 2, reveal 
that if 7r(0) 5o, initially the population does split into two groups and 
begins to move apart, but as t — > oo, the empirical measure converges to 
a Gaussian-shaped hump. This suggests the possibility that in the original 
Dieckmann-Doebeli model, conditioning on the population surviving long 
enough for convergence to stationarity to occur (recall that in the original 
Dieckmann-Doebeli model, extinction occurs in finite time), speciation is 
also a transitory phenomenon, rather than a stationary phenomenon. Simu- 
lations of (1) with W = W^'^\ shown in Figure 3, do not even display tran- 
sitory speciation behavior. Instead, the initial spike at simply widens to a 
Gaussian hump centered at 0. Hence the particular form of the dependence 
on K and C *tt seems to affect whether or not speciation occurs. 

From the simulations and nonrigorous analysis above, it seems that the 
dynamical system in (1) does not have a bimodal stationary distribution if 
both K and C are taken to be Gaussian-shaped. But we would also like to 
point out that if the shape of K or C were changed just a bit, for example. 
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Fig. 3. Simulation of (1) with E = [-149, 149] n Z, an = 60, ac = 55 and W = 



making K a bit more "flat" by taking Kx = exp(— x^~^/2o"^), simulations 
then display stationary distributions that have two (or even more) modes. 
Thus the shape of the stationary distributions of the Dieckmann-Doebeli 
model seems to have a very delicate dependence on the choice of K and C. 
We also speculate that since the original Dieckmann-Doebeli model has a 
fluctuating population size whereas our nonrigorous analysis only applies to 
a model with fixed population size, this small difi'erence may also disturb 
the long-time behavior of the model enough that a Gaussian hump does not 
appear with high probability before the population becomes extinct. 

In case of this transitory behavior, we may still say speciation has oc- 
curred. A constant carrying capacity function is only an approximation of 
what actually happens in nature, where the environment a species lives in 
can change quite drastically over a long period of time. By assuming a car- 
rying capacity function that does not change over time, we are essentially 
studying what can happen to a single species over time lengths during which 
this approximation is reasonable. 
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We also have a rigorous result for this discrete-time model, for the sim- 
plest case of K and C both rectangular, that is, Kx = 1{|x|<l} and Cx = 
l[\x\<M} for some integers L and M. Theorem A. 0.14 from [10] says that 
if z/" is a convergent sequence of symmetric stationary distributions for 
the conditioned Dieckmann-Doebeli model where the mutation matrix A 
corresponds with a convolution kernel fi'^d-i + (1 — 2/i")5o + l-i-"'^! then 
i/"([-(M - L + 1),M - L + 1]) ^ as //" ^ 0; in words, the mass in the 
middle gets very small as the mutation parameter approaches zero, hence 
there exist bimodal stationary distributions if ^u" is sufficiently small. 

In the next section, we introduce the Fleming- Viot model that we study 
for the rest of the work. It is a continuous-time model that approximates a 
Moran particle system. The main advantage of a Fleming- Viot type model 
is that if the fitness function is chosen to be a quadratic form in vr, then the 
exact form of the stationary distribution is known in the literature [7]. 

2. The Fleming Viot model and main results. We work on the pheno- 
type space E = [—L,L] fi Z. Sometimes we refer to a phenotype as a site in 
E. Let 

A= |(7r_L,...,7ro,...,7rL):7ri >0 and ^ vrj = l| 

be the space of probability measures on E, that is, A = 'P{E). Members of A 
are usually denoted by vr, vr, vr^, and so on. We endow A with the following 
metric: 

(i(7r,7f) = max|7r(x) — 7!'{x)\. 

We assume a monomorphic initial condition, that is, ttq = 5x for some x E 
(in fact, we take x = mostly). 

Recalling that the essential ingredient of the original Dieckmann-Doebeli 
model is that the fitness function is an increasing function of Kx and a 
decreasing function of (C *tt)x, we define fitness nix{TT) and mean fitness 
m,r for our Fleming- Viot model (for a precise definition the Fleming- Viot 
process, see [5] or [2]) to have the following form: 

mxin) = Kx'^Bx-zK^TT^, 

(4) 

X 

where the "cooperation" kernel B can be taken to be 1 — C . We assume B 
is symmetric. In the original Dieckmann-Doebeli model, pairs of individuals 
with small phenotypical distance compete at a higher intensity than pairs 
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of individuals with large phenotypical distance; in our model, pairs of indi- 
viduals with small phenotypical distance cooperate at a lower intensity than 
pairs of individuals with large phenotypical distance. 

The term Bx-z in the definition of r?i^(7r) above can be thought of as 
a measure of how inefficiently an individual of phenotype z makes use of 
resources of type x, that is, the type that best suit individuals of phenotype 
X. For example, if individuals of phenotype z cannot makes use of resources 
of type X at all, that is, Bx-z = 1) then they contribute to an increase to 
the fitness of individuals of phenotype x, since this type z individual will 
not compete with type x individuals. On the other hand, if individuals of 
phenotype z makes perfect use of resources of type x, that is, Bx-z = 0, 
then these individuals contribute no increase to the fitness of individuals of 
phenotype x. From the point of view of a particular individual of phenotype 
X, he "prefers" (if he is selfish) all other individuals in the population to 
be of phenotypes z with Bx-z = 1, so that no other individual can make 
use of resources for which he is best adapted. Thus the term "cooperation" 
is somewhat misleading, since individuals with different phenotypes do not 
really cooperate with each other. Nevertheless, we use "cooperation" and 
"competition" to describe the effect of individuals of a certain phenotype on 
individuals of another phenotype out of convenience. If Bz = 0, then we say 
phenotypes separated by distance z do not cooperate at all (i.e., compete 
at full intensity), and if i?2 = 1, we say they cooperate at full intensity (i.e., 
do not compete at all). 

2.1. The model. Let K -.E ^ [0,1] be the carrying capacity function, and 
B-.Z^ [0,1] be the cooperation kernel. We assume B is symmetric. We 
define 



to be the generator of our Fleming- Viot process with selection and mutation, 
where 6xy = 1 ii x = y and = otherwise, and the fitness of site x in a 
population with distribution vr and the mean fitness of the population 
are defined in (4). A Fleming- Viot process with finitely many types is also 
known as a Wright-Fisher diffusion (see [2]), but to stress the continuous 
time nature and avoid confusion with the discrete time Wright -Fisher model, 
we still refer to our model as a Fleming- Viot process, which is a special case 
of the Fleming- Viot process with selection as described in Chapter 10.1.1 



Q = -^C^ - C^L + 1)tTx) + aTrx{mx{7r) - m^) 



1 d 



x=—L 



X 



(5) 




of [2]. 
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In (5), the terms that correspond with the effect of selection and replace- 
ment sampling are the following: 

g' = a E vr.(m.(vr)-M.)A + i. E ^.(5., - vr,)^-^. 

x=-L x,y=-L y 

approximates the following Moran particle system (see page 26 of [2] 
for a precise definition of Moran particle systems) with a population of N 
individuals undergoing strong selection for suitably small a (e.g., a < 1/2 if 
K<1): 

• TT^ decreases by and vr^ increases by at rate ^'^x + " 
"^x(vr^))X. 

To see this, we expand the generator Q^'^ for the particle system above for 
smooth and compactly supported /(vt-l, . . . , ttl) : M^^+^ — > M: 



x,y=~L 



xQ + |K(-^)--.(vr^)))< 



E 

x,y=~L 



N d-Kx dTTy 

x;rf(l + |(m,(vr^)-m.(vr^)))< 

x=-L ^ 
x,y=-L y 

Therefore the generators of the particle system and the Fleming- Viot pro- 
cess without mutation, Q^'^ and respectively, agree up to 0{1/N), and 
as N ^ oo, the stochastic process associated with them both converge to 
the solution of the following system of deterministic ordinary differential 
equation (ODEs). 

(6) dtTTx=cr'Kx{'mx{'K) -m^). 
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In the mutation component of G, J2x=~L ^(^ ~ (^^ + ^)'^x)d/d7rx, we use 
the simphfying assumption that the mutation is symmetric, that is, the rate 
tJ-xy = fJ-y at which phenotype x mutates to phenotype y depends on y only, 
and furthermore fiy = fi is constant in y; the latter assumption makes the 
proofs a bit cleaner. In the original Dieckmann-Doebeli model, the variance 
of the mutation kernel only affects the speed of evolution, not the even- 
tual configuration taken up by the population. Therefore the assumption of 
symmetric mutation should not affect the stationary behavior of the process 
a great deal, and it is precisely this assumption that enables one to write 
down the unique stationary distribution for the Fleming-Viot process, as 
well as a Lyapunov function for its infinite population limit. Furthermore, 
the mutation component of Q is an approximation of the A^-particle system 
that undergoes the following: 

• TT^ decreases by and vr,^ increases by at rate y/xtt^, with an 
error term of 0{fi/N). We can expand the generator Q^'^ of the particle 
system above: 



L 

^ E 

x,y=-L 



2 E a-(2^+lK^ 

x= — L ■'' 



+ 



4N 



E - 

x,y=-L 

+ 0(l/iV2), 



N 



which has a rather messy noise term (the term involving second derivatives 
of /). The interesting cases are those with small and we only retain 
the drift term (terms involving first derivatives) in the expansion of Q^^^^ 



Combining this with the drift and noise terms in the expansion of Q ' , we 
obtain the generator Q. As — > oo, the process with generator Q converges 
to the solution of the following system of deterministic ODEs: 



(7) 



{2L + l)7r^ 



One can apply Theorem 1.6.1 from [5] to establish this convergence. 

The generator Q is of the form defined in Lemma 4.1 from [7] if one speeds 
up time by in (5), and a direct application of that result implies the 
following result: 
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Proposition 2.1. For the Fleming-Viot process with generator Q, 

is the unique stationary distribution, where C is the normalizing constant 
such that is a probability measure on A. 

We define jl = fx — jj , which we assume to be positive, and write 

iy^(ci7r) = Cexpjy + ^/i - -^^ ^ logn^^^ dir-i ■ ■ ■ diTL- 

As ^ oo, we expect to concentrate more and more on the configuration 
that maximizes 

L 

(8) 14 = m^ + /i logTTa.. 

x=—L 

One can imagine a scenario where initially all birds in the population have 
beaks that specialize in eating seeds of say size 5, 5 being the most common 
size in the forest. As time passes, the selection part G, J2x=-L^^xi''T^x{''^) ~ 
rn-,^)d / diTx, moves the population toward a fitter configuration, since the 
mean fitness is a Lyapunov function of the dynamical system (6). [A 
proof of a slightly more general statement can be found in Lemma 2.4(a).] 
If the forest produces nearly as many seeds of size 4 and 6 as seeds of size 
5, but the birds can really just eat one size of seeds (e.g., if a bird's beak 
specializes in seeds of size 5, then it is very bad at eating seeds of size 4 
or 6), then it is quite possible that the population as a whole does better, 
that is, is more fit on average, if half the birds specialize in seeds of size 4, 
while the other half in seeds of size 6. This way, even though each bird has 
to spend slightly more effort to find seeds that suits her (since is slightly 
larger than or Kq), she only competes with half the population. 

Proposition 2.2 below says that as the population size becomes large and 
the mutation parameter becomes small, the stationary distribution fo- 
cuses more and more on the configuration that achieves the maximum fit- 
ness. This configuration is also the one that the population spends the most 
time in. Going back to the example in the last paragraph, if it can be verified 
that the configuration where half the birds specialize in seeds of size 4 while 
the other half in seeds of size 6 maximizes fitness, then starting from an 
initial population where all birds specialize in seeds of size 5, the population 
will eventually drift to the fitter bimodal configuration. In this case, we can 
say that speciation has occurred. Because of the stochastic nature of the 
model, eventually the population will leave this maximally fit configuration 
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and enter some less fit configuration. But this may not happen for a long 
time, after which the validity of the approximation that carrying capacity 
is constant over time may no longer be valid. 

We broadly say that speciation is likely to occur eventually if the pop- 
ulation configuration that achieves maximum has significant mass at 
phenotype(s) different from the original one (the original phenotype may 
or may not die out as a result of speciation). In general, however, it is dif- 
ficult to identify the configuration that (globally) achieves maximum V^, 
or even to verify that a certain configuration achieves it, due to the non- 
concave nature of the V-^ and even m-^. Studying local maxima of then 
becomes useful, where one can exclude certain classes of configuration from 
candidates for the global maximum of 14- • For example, if one can exclude 
configuration close to 60 as a local maximum, then the global maximum will 
have significant mass at sites other than 0, and we may also say speciation 
will eventually occur in this case. 

Mutation effects alone produces individuals of all phenotypes in E, so the 
word "significant" in our definition of speciation is taken to mean a number 
that does not go to as the mutation parameter fi goes to 0. Since mutation 
just distributes mass evenly to all sites in £', a large fi obscures the effects 
of selection, thus we are mainly interested in small fi. We attempt to bound 
mass at various phenotypes away from as ^ ^ 0. The case of = is not 
interesting, since with initial condition 6q the configuration will then remain 
monomorphic and no speciation can ever occur. 

Propositions 2.2 and 2.3 below relate local/global maximum of to 
those of when fi is small. In particular, as can be expected, they are 
quite close to each other when fx is small. 

Proposition 2.2. Let {7fi,7f2i • • • i^fc} be the finite set that consists of 
all global maxima ofrrTj^ for vr G A. For any sufficiently small e > 0, we can 
pick fi small and N large, such that 

u^(^\jBall{ni,£)^ >l-£, 

where Ball(7r,e) denotes the intersection of A and the ball of radius £ cen- 
tered at vr. 

Proposition 2.3. Let vf be the unique local maximum ofrrTj^ for vr G A 
in a small neighborhood of tt. If fl is sufficiently small, then as defined 
by (8) has a local maximum in a small neighborhood of tt. 

o 

Lemma 2.4. We have (a) V^:A^M is a Lyapunov function for the 
dynamical system 

(9) dtTTr, = vr^ (^m^ -rn^ + ^ (^-^ - {2L + 1) 




ON SYMPATRIC SPECIATION 



15 



and therefore any local maximum o/ a stationary point of (9) . 

(b) If TT is a stationary point of (9), then irix + is constant for all 
x€E, and iJ2xeJ''^^(.'^)'^^)/('^xej'^x) + fi'/{'^J2x€j'^x:) is equal to the same 
constant for all J <Z E. 

(c) Suppose TT is a stationary point of (9), then mx(7r) > (resp. <) 
if and only if tt^ > 1/(2L + 1) (resp. <). And mx^n) > my^fr) if and only if 

To say something specific about when speciation is likely to occur, we 
specialize to m, K and B of the following form in most of the results we 
establish (in fact, all results except Theorem 2.5): 

Assumption 1. We have (1) nix is of the form defined in (4). 

(2) K :E ^ (0, 1] symmetric and unimodal (i.e., increasing on [—L, 0] fl Z 
and decreasing on [0, L] n Z) with Kq = 1. 

(3) S:Z^[0,1] with 5^ = 6+ (1-6) 1{|^|>M| with 6 e [0,1]. 

With the cooperation kernel as defined in 3 above, the individuals of 
phenotype x are more efficient at using resource of types inside the interval 
{x — M, X + M) than inside [— L, x — M] U [x + M, L] , and b can be thought 
of as a measure of how efficiently individuals of phenotype x use resources 
of types {x — M, x + Af ), with 6 = meaning maximally efficient. 

If 6 = 1, then each individual can use resources of all types equally effi- 
ciently (or inefficiently), and every individual suffers exactly the same level 
of competition from the rest of the population. This actually means that 
competition plays no part in determining how fit site x is and is propor- 
tional to Kx - Therefore, since Kx is unimodal (hence Kx is strictly increasing 
in [— L,0] and strictly decreasing in [0,L]), the fitness should be unimodal, 
too. Lemma 2.4(c) says that stationary distributions of (9) has the property 
of fitter sites having more mass, thus we expect the stationary distribution 
vr to be unimodal as well. In particular, vr should attain its maximum at 
x = 0. As /i — s- 0, Proposition 2.2 tells us that we can expect the peak of vr 
concentrated around to become sharper and sharper, approaching 5o, the 
5-measure concentrated at 0. In fact. Theorem 2.5 (where we do not assume 
Assumption 1) shows that if B is of the form as in Assumption 1(3), then 6 
only needs to be close to 1 for this behavior to occur. In this case, speciation 
is not likely to occur. 

Theorem 2.5. If K : E ^ [Q,!] is symmetric and unimodal with Kq = 1, 
and Ki = K-i < Bx < 1 for all x € E, then for any e > 0, there exist fj, 
and 1/N small enough such that i^^{{tt & A-.ttx > e for some x G -E\{0}}) < 
e, where is the stationary distribution of the Fleming-Viot model with 
generator Q . 
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2.2. Results for intense competition with relatively large fi. More inter- 
esting behavior arises when there is intense competition between pairs of 
sites that are close to each other, that is, b is small and individuals of phe- 
notype x are far better at using resources of type {x — M, x + M) than other 
types. In this case, speciation is likely to occur for certain K and B. Whether 
or not speciation occurs depends on the exact shape of K and the muta- 
tion parameter ^, and is a difficult problem for general K (even assuming 
symmetry and unimodality) and fi. 

We first present a result (Theorem 2.6 below) that roughly says that if any 
local maximum vr of with ttq suitably small cannot have all the remaining 
mass on one side of 0, that is, there is significant amount of mass in both 
[—L, —1] and [1, L], which does not go to as /i — > 0. 

Theorem 2.6. Suppose Assumption 1 holds. Let M >1. If n is a sta- 
tionary point of (9) where ttq < Ki/{2M - 1)(2L + 1) and tTx > 1/(2L + 1) 
for some site x € [—L,—l] (resp. x S then 

§ - (2M - mL + 1) "^"^ w^y W^) " ) 

[resp. T.~zLl^z > Kimm{l/Ki - 1, 1)/(2(1 - b){2M - 1){2L + 1))/. 

Remark 2.7. If vr is a stationary point of (9) where ttq < Ki/{2M — 1) x 
(2L -|- 1) < 1/(2L -|- 1), then some site other than is bound to have more 
mass than the mean 1/(2L -|- 1). 

Let [cj denote the largest integer less than or equal to c, and \c\ denote 
the smallest integer larger than or equal to c. Theorem 2.6 requires the mass 
at phenotype to be rather small. Proposition 2.8 below guarantees this to 
be the case for any local maximum of if ^"^/^ is relatively large compared 
to h and the carrying capacity function decreases rapidly to very small 
levels (smaller than iJ?/'^) before reaching M or —M. 

Proposition 2.8. Define I = -{n - M) and p = [M/2] . Suppose As- 
sumption 1 holds and tt is a stationary point of (9). If p, < AKp/{AL -\- 2)^ 
and there is an n< M such that 

(10) b + Kn<{fiKp/4.f/\ 

then 

2((/iKp/4)2/3_6-/^„) 
for x G [-L, -n] U [-/, /] U [n, L] . 
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In the parameter regime of Proposition 2.8, the sites in [— L,— n] U [n,L] 
have very smaU carrying capacity such that they cannot support significant 
population, and there is enough mutation effects to force most of the popu- 
lation into intervals {—n,—l) and {l,n) for any local maximum of 14-. This 
is a different effect from mutation simply spreading mass to all sites in E 
evenly, since the mass in {—n, —I) U {I, n) does not go to as /i ^ 0, as long 
as b + Kn < is satisfied. As shown by Proposition 2.10(b) below, 

6q is a local maximum of if Km < b, which may hold in the param- 
eter regime of Proposition 2.8. But with the combined effect of mutation 
and selection, no configuration close to Sq can be a local maximum of Vt^. 
If for example {b + Kn) / {flKp/A)"^/^ 0, then Proposition 2.8 implies that 
for sufficiently small /2, vr^ < for x G [-L, -n] U [-/,/] U [n,L]. 

Furthermore, the mass in (— n, — /) U {l,n) cannot concentrate on one side of 
by Theorem 2.6. Therefore in this case, speciation is likely to occur and 
there are at least 2 new species, with phenotypes in (— n, — /) and (/,n). 

Proposition 2.8 holds even if 5o is not a local maximum of ffiTr, in which 
case Proposition 2.10(d) below provides possible existence (which is verified 
by our simulation) of a local maximum of ttTtt of form vr = p6-M + (1 — 2p)5o + 
p6m, where p may be quite small. In this case, the relatively large mutation 
effects still prevent any configuration close to vr from being a local maximum 
of Vtj, and the population is driven toward a bimodal configuration. 

2.3. Results for intense competition with small p,. If h is fixed but ^ is suf- 
ficiently small, then it is possible for a configuration close to 5q to be a local 
maximum of V^. For such a parameter regime, we first present Theorem 2.9 
below that says if there is little mass outside the intervals (— [M/2J , [M/2J ) , 
then all mass must be concentrated at site for sufficiently small /2. This re- 
sult precludes the existence of any configuration with mass spread amongst 
sites near as a local maximum of 14- ) such that if there is speciation, then 
there must be new phenotypes far away from if /i is small. 

Theorem 2.9. Let q= [Af/2j. If Assumption 1 holds and -k is a sta- 
tionary point of (9) that satisfies 



xel-L~q]Ulq,L] 





then 




Simulations indicate that all local maxima of m-jr have supports that con- 
sist of sites spread exactly M apart, but a proof of this statement remains 
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elusive. The landscape of fn-,^ is rather complicated - explicit calculations for 
low dimensional systems (L = 1 and L = 2) indicate that there are interior 
stationary points of (6) but they are all saddle points of m-,^. Therefore if 
one tries to prove the above support property of local maxima, an approach 
that only looks at the stationary points of (6) would probably not work. 

Proposition 2.10(a) below shows that the support of any local maximum 
it of cannot have a support that consists of sites spread more than M 
apart, and the rest of Proposition 2.10 gives formulas of local maxima with 
supports that consist of 1, 2 and 3 sites. Existence of local maxima with 
support that consists of more than 3 sites is also possible, but then explicit 
calculation becomes prohibitive. If jl is sufficiently small, Proposition 2.3 
provides local maxima of 1^ that are close to the local maxima of fn-,^ . 

Proposition 2.10. Suppose Assumption 1 holds and -it is a stationary 
point of (9). 

(a) // 7T has mass only on xi < ■ ■ ■ < Xk and xj^i — xj > M + 1 for some 
j, then TT cannot be a local maximum ofm-,^. 

(b) If Km = K^M < b, then 5q is the unique local maximum of in a 
small neighborhood of 5q. If Km = K^]\j > b, then is ™t o- local maximum 
ofrriT,. 

(c) Let X G \-M + 1, -1] r\'Landp = K-x+m{K-x - bK-^+u) / {2K-x x 
K-x+M — bK^^ — bK'^^j^j^i), then n = p5_x + (1 — p)5-x+M is the unique 
local maximum ofrnj^ in a small neighborhood of fc with 



2K.xK.x+M - bKl, - bKl^_^M ' 

if b, K_x-M, and K^x+2M are all < K^xK^x+Aii^ + b)/{K^x + K-x+m), 
and bKl^ + bKl^^j^.^ < 2K_xK.x+M- 

(d) Let X e [-M + 1, 0] n Z (resp., x G (0, M - 1] n Z ), 

a = K'^-M^x + ^x-M^x+M + ^x^x+Mi 

c = 2Kx-mKxKx+m{Kx-m + Kx + Kx+m) - (1 + b)a, 

P = '^'^'^^+^ (Kx-mKx+m + Kx-mKx - (1 + b)KxKx+M), 
c 



Kx-mKx+m 



(Kx-mKx + KxKx+M — (1 + b)Kx-MKx+M), 



then vr = p6x-M + Q^x + (1 — P — q)Sx+ai "is the unique local maximum of 
in a small neighborhood of vr with 

{2-b- b^)Kl_,,KlKl^,, 



2Kx-mKxKx+m{Kx-m + Kx + Kx+m) — (1 + b)a ' 
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if 



2-6>(l-6)( ^ ^ ^ 



resp., 2-6>(l-6) — + 



Kx-2M < 
Kx+2M < 



Kx Kx+M J Kx^M . 

Kx^mKxKx+m{2 -b-b^) 



(1 — b){Kx~MKx + Kx-mKx+m + KxKx+m) 

Kx-mKxKx+m{2 -b-b^) 

(1 — b){Kx^MKx + Kx^mKx+ai + KxKx+m) 



and 



2 

1 + ^ < -Kx-AlKxKx+MiKx-M + Kx + Kx+m)- 
a 

2.4. Simulation and discussion. Even though we have no general recipe 
for identifying the global maximum of 14- , but only a class of local maxima, 
it is nevertheless possible to check whether speciation is likely to occur. 
For that, we can first check whether a configuration close to 5o is a local 
maximum of if not, then speciation is likely to occur. 

As /i ^ 0, local maxima of 14 converge to those of m-,^. If we are interested 
in the behavior of the process when /i is in this regime, it suffices to check 
whether 5q is a local maximum of nij^, and Proposition 2.10(b) provides the 
answer. 

If we are interested in the behavior of the process when jl is small but 
fixed, then Theorem 2.6 and Theorem 2.9 provide partial answers. If there 
is no configuration close to 5q which is a local maximum of 14 , then by Theo- 
rem 2.9 (its contrapositive) , there must be significant mass in [—L, — [Af/2j] U 
[[M/2J , L] for sufficiently small /i, so that there are new phenotypes far away 
from 0. Moreover, if ttq is quite smah (< Ki/{2M - 1)(2L + 1)), then The- 
orem 2.6 says that there must be significant mass on both sides of site 0. 
These two results taken together mean that if configurations close to 5q is 
not a local maximum of 14 j then for any local maximum, there is significant 
(i.e., does not get small when /i gets very small) mass at more than one site, 
hence speciation is likely to occur. 

Proposition 2.8 provides a condition under which all local maxima of 
14 are bimodal. In this case, /x^/"^ is relatively large compared to b and the 
drift component (terms involving first derivatives) of Q drives the population 
toward a bimodal configuration. An example of this case is shown in Figure 4. 

On the other hand, if /i is small and 5q is a local maximum of 171^^, then a 
configuration close to Jq is a local maximum of 14 • But it does not mean that 
speciation is unlikely to occur since 5q may not be the global maximum of 
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t = t= 1000 i = 22SG t = 2500 t = 2750 



J 



Fig. 4. When fi is relatively large, So is not a local maximum and speciation occurs at 



timet = 2750. 
u = 6 X 10"^ 



Here L=1A, N = 225^ 



=0.01 + 0.99- l{|,|>io}, 



■- exp(— 2: /20) and 



mjr, Proposition 2.10(c) and (d) provides other local maxima oirriT^. If one of 
these local maxima, not (5o , is the global maximum of , then the stationary 
distribution concentrates mostly on a configuration that has at least 2 
modes, and speciation is likely to occur. But if the initial configuration is 
very close to (5o, then the drift component of Q moves vr toward the local 
maximum of that resembles b^. One must wait long enough for the noise 
component (the term involving second derivatives) of Q to drive vr away from 
(5o and to a configuration in the basin of attraction of the more fit bimodal 
configuration. Since the noise component of Q is 0(1/A^), it may take a long 
time to get away from the less fit local maximum that resembles when the 
population size is large, that is, speciation may take much longer to occur 
than in the case of b^ not being a local maximum of 1x1^^. An example of this 
case is shown in Figure 5. 



I = D 
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( = 6075O 



Fig. 5. When /j, is small, So is a local maximum and speciation occurs at t = 13500, 
much later than in Figure 4. Even after speciation occurs, the population continues 
to move to fitter local maxima, but this takes even longer. Here L = 14, A'^ = 225'^, 
Bx = 0.01 + 0.99 • l{|a;|>io}, = exp(— a;^/20) and /i — 5 x 10~^. For comparison pur- 



poses, 
left. 



the dotted lines at t = 12,500 to t — 60750 are duplicates of the figures on their 



As can be seen from the figure above, speciation occurs around time t = 
13500, but this is a different configuration from the one reached in Figure 4 
(/i = 6 X 10~^) at time t = 2750. The configuration in Figure 4 at time 
t = 2750 has the two peaks symmetrically placed on both sides of x = 0, 
at X = 5 and x = —5, but the configuration in Figure 5 (// = 5 x 10~^) at 
time t = 13,500 has the one of the peaks at x = 1 and the other at x = 
—9, a much less fit configuration than the one with symmetrically placed 
peaks. At time t = 60,750, the selection component of G succeeds in driving 
vr to a fitter configuration, with two peaks placed at x = 2 and x = —8, 
but with fi = 5 X 10~^ instead of /_i = 6 x 10~^, it will take much longer to 
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fl 

Fig. 6. A plot of speciation time (mean of 20-50 realizations) against the mutation 
parameter ^i. HereL = lA, iV = 225^, = 0.01 + 0.99 • l{|^|>io} and = exp{-x^ / 20) . 
We did not perform simulations for fi < 4.4 x 10~^ because it takes too long, but we expect 
speciation time to grow at a fast rate as fi decreases beyond 4.4 x 10^^. 

reach a configuration with peaks at x = —5 and x = 5, supposedly the global 
maximum. 

Simulations indicate that for fi > 5.028 and K, B and as given in Fig- 
ures 4 and 5, 5q is a local maximum of 14-, but 5q is not a local maximum 
of 1^ if /.i < 5.027. We surmise that the deterministic dynamical system (9) 
has a bifurcation near = fi (/i ~ 5.027 for the simulations shown in Fig- 
ures 4 and 5), which causes the drastically different speciation time of the 
Fleming-Viot process (5) when decreases from 5.5 x 10~^ to 4.5 x 10~^ 
(see Figure 6). Notice that our simulation has = 225^, so that the noise 
component is large enough for the simulation to achieve speciation in rea- 
sonable amount of time when fi < ji. When N is much larger than 225^, we 
expect the increase in the time until speciation to be much more drastic. 
And if N = oo, then speciation will never occur for /i < /i if one starts with 
initial condition 6o, since in this case, the dynamical system (9) coincides 
with (7), the infinite population limit of the Fleming-Viot process. 

Thus whether or not a configuration close to 6o is local maximum of 
affects the length of time it takes for speciation to occur, assuming that 
a configuration with significant mass at sites other than is the global 
maximum of such that speciation is likely to occur. In addition to the 
unexpected dependence on the mutation parameter n, the carrying capacity 
K and cooperation kernel B also affects greatly whether a certain configu- 
ration is a local maximum of or V^. In Proposition 2.10(b), for example, 
a small change in K ox b can change whether 5q is a local maximum of 
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Fig. 7. How to pick D. 



m^. This is somewhat similar to the behavior exhibited by the conditioned 
Dieckmann-DoebeU model, where we remarked near Figures 2 and 3 that a 
slight change va K ox C may have a large effect on the shape of stationary 
distributions of (1). 

3. Proofs. We start by proving Proposition 2.2 and Proposition 2.3, the 
two statements that relate global/local maximum of to that of m-,^, when 
^ is small. 



Proof of Proposition 2.2. We observe that m : vr i-^ is a quadratic 
function, hence a continuous and open mapping. Near each global maximum 
vfj, the Hessian matrix must be positive definite; otherwise, there would exist 
an entire subspace of global maxima. Therefore for sufficiently small e, we 
can pick open neighborhoods A and B such that B d Ac. Uf^j^ Ball(7fj, e) 
and 

inf — sup m,r > 6 

'^eB 7reA\yl 

for some positive 5. Since the vr G A that maximizes Y^^^_i^\og'nx places 
equal weights on x £ E, 

L 

(11) sup 5] log7r^<-(2L + l)log(2L + l). 

Furthermore, there exist positive 5i and 82 (independent of fi and N) and 
an open set D C B, even if some vTi's are on the boundary of A, such that 
\D\ > 61 (where is the volume of D) and tTx > S2 for all n € D and x € E 
(see Figure 7), thus 

^ 1 
inf V log7r^>-(2L+l)log-. 
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Combining the three inequahties above, we obtain 

infK- sup K><5 + /i(2L + l)f-log^ + log(2L + l)y 

Consequently, if [i and are sufficiently small, then 

inf K- - sup K > 

'^^^ 7rGA\A ^ 



and thus 



inf^ggexp((Af/2)K) ^ ^^5/4 



sup^GA\A exp((iV/2)K) 

Define C\ = sup^g^\^exp(A^y^/2), then we obtain the following bounds for 
/exp(7VK/2)(i7r: 

^ exp (^y > X (t^-) '^'^ > '5lCle^^/^ 

^^^expJ^yK^ dvr < Ci|A|, 

where we use |D| > 5i to obtain the first inequality above. The two inequal- 
ities above in turn imply that 

/^exp((iV/2)yO(i7r ^ 6ie^^/^ 



u'' (A) = ^ ; 7; " > 



/^exp((Af/2)K)(i7r " \A\+6ie^s/^ 

if N is sufficiently large. Since we are done. □ 

Part of the proof above can be easily adapted to prove a related statement 
on local maxima. 

Proof of Proposition 2.3. We use similar ideas as in the proof of 
Proposition 2.2. Near vf, the Hessian matrix of the quadratic function 7717^ 
must be positive definite. Therefore we can pick open neighborhoods A, B 
and Z such that B C Ag Z of n and 

inf — sup fn.,^ > 6 
•^eB neZ\A 

for some positive 5. Furthermore, there exists positive 82 (independent of 
^), and an open set D C B such that tt^ > 62 for all vr € D and x E, and 
thus 

^ 1 
inf V log7r^>-(2L+l)log-. 

x€D — \ 02 

x=—L 
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The two inequalities above and (11) imply 

inf 14- sup V;,>(^ + /i(2L+l)f-log^ + log(2L + l)y 

7Tez\A V 02 / 

Consequently, if /i is sufficiently small, then 

inf K - sup K- > 7T> 

and there exists a local maximum of in A. □ 

We study local maxima of 14 by checking various points in A for sta- 
tionarity when evolved according to (9). It also happens that the dynami- 
cal system (9) is almost the same as the limiting dynamical system of our 
Fleming- Viot process (7). 

Proof of Lemma 2.4. (a) Notice that d-^JfiT, = 2Y,zKxBx~zKzT^z = 
2mx- Therefore if vr evolves according to (9), then 

dtV^ = J2{dM{dt7^x) 

X 

= E(2-. + ^)-.(-.. + ^-m.-|(2L + i; 

= 2E("^- + ^)^-("^- + ^ - f (2^ + 1)) 

+ |(2i + 1)) + f + 1)) . 

where the term in the last line T^xi^x + — "Itt — 2 (^-^ + 1)) = Yl,x '^^ ^ 
nix —^TT, which is zero. This implies 

dtV^ = 2Y,T^x[rnx + ^-m^-^{2L + l)^ > 0, 

which establishes (a). 

(b) From (a), at any stationary point vr of (9), we have mx{TT) — rn^ + 
^{I/tTx — (2L + 1)) = for all x € E, therefore mx{n) +fl/2T:x is constant for 
all X € E. We define this constant to be c = mx{Tf) + fi/2%x, then mx{T^)T^x + 
jl/2 = cTTx for all x ^ E. Thus for all J C E, 

J2xej'>^x{TT)TTx ^ jl _^ 

SxeJ^x '^Y.xeJ^x 
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(c) Since 

(12) +|(^^_(2L + 1)^ =0, 

^xi^^) > TTT^TT implies I/tTx — (2L + 1) < and vice versa. The cases of mx{TT) < 
nij^ and ^^^(vr) > my('jr) are similar. □ 

In fact, according to Theorem A. 9 of [1], (9) is a so-called Svirezhev- 
Shahshahani gradient system with potential V^, that is, dtTT = W^n), where 
VV{ir) = Gt^W{-k) and Gj^ is the matrix formed by entries g^^ = iTx{6xy — 
TTy). Any gradient system, such as (9), has the property that all orbits, 
regardless of initial condition, converge to some point in the cj-limit set 

= {p:p is an accumulation point of 7r(i) as t ^ oo}. 

All points in are stationary points of (9). 

3.1. Mild competition. In this section, we establish Theorem 2.5, which 
says that speciation is impossible if competition between phenotypes close 
to each other is mild. 

Lemma 3.1. If n is a local maximum of TFIt^ with support S, that is, 
T^x > for X € 5 only, then mx{7r) are all equal for x S. In other words, if 
'nix{'^) 7^ then either tTx = or TTy = 0. 

Proof. A simple calculation involving Lagrange multipliers: 

rriw + A ^ vTa; J = =^ 2?n^ + A = 



d-Kx 

establishes the desired result. □ 



This observation enables us to establish the following: 

Lemma 3.2. If K -.E ^ (0, 1] is symmetric and unimodal with Kq = 1, 
and Ki = K-i < Bx < 1 for all x ^ E, then Jq is the unique global maximum 
of the mean fitness function . 

Proof. Assume Ki = K^i < Bx <l for all x e E. The following two 
estimates 

mo = ^0 E B^.K^TT, > Ko (mf ^ K.n, = (inf B.) ^ K.vr,, 
rrix = Kx E Bx-zK-^-K-^ < ^ K^tTz for x 7^ 
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imply that niQ > nix for all 2; 7^ if inf^ Bz > sup^_^o-^a; — -^i- Lemma 3.1 
then implies that any local maximum vr of whose support includes 
must have either ttq = 1 or ttq = 0. If ttq = 1, then n = do, and m^- = mo('7r) = 
Bq> Ki. But if TTo = 0, then for all x in the support of vr, 

rni^ = mx<KiY^ K^tt^ < Ki < Bq 

z 

by assumption. Therefore a local maximum vr whose support does not in- 
clude must have smaller mean fitness than 60, that is, So is the unique 
global maximum of rn.^. □ 

Proof of Theorem 2.5. A direct apphcation of Proposition 2.2 leads 
to the desired result, which actually applies to very general B (but still 
requires K to be symmetric and unimodal). □ 

3.2. Intense competition. Now we focus on the case of intense competi- 
tion, that is, b is small. The goal is to establish some conditions under which 
speciation is likely to occur. The first result in this direction is Theorem 2.6, 
which roughly says that any local maximum tt of with ttq suitably small 
cannot have all remaining mass on one side of 0. For this, we first establish 
the following lemma, which assumes that there is significant mass at site 
—y to the left of site 0. If —y > —M, then the lemma shows there is mass 
in the interval [—y + M, M], which implies Theorem 2.6. But if —y < —M, 
then the lemma shows there is mass in the interval [—y + M, 2 — M] , which 
implies that the previous case holds, and in turn Theorem 2.6 holds as well. 



Lemma 3.3. Let M >1. Suppose Assumption 1 holds and n is a sta- 
tionary point of (9) . 

(a) If7r^y>l/{2L + l)forsome-y£{-L,-M] and tt^ < 1/{2L + 1) for 
all zG[-y + 1, -1], then YI^^-m'^z > i:f=m..(-y+M,2-M) > Ki/{2L + 

(b) If TT^y > D >ttq for some —y € [1 — M, —1], then 

v^^ . fr^ D D / 1 \\ 

> VTz > min D, — — , — — 1 . 

z^^^M V '2(l-6)'2(l-6)Ui JJ 

Proof, (a) We observe that —y + M < 1, therefore the cooperation 
intensity between sites —y and 1 is 1. We define 

-y-M M L 1-M 

z=-L z=-y-M+l z=M+l z=~y+M 
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where the last sum may be over an empty set, in which case that sum is 
defined to be 0. The fitness of sites —y and 1 are 

m_y{TT)=K_y(A+{l-b) K.ttX 

\ z=max{-j/+Af,2-M) / 

Cmin(l-A/,"j/+A/-l) \ 
A+{l-b) ^-^0- 

If m-y{7r) < mi(7r), then tti > 7r_.y > 1/(2L + 1) > Ki/{2L + 1) by Lem- 
ma 2.4(c) and we are done. Otherwise, m-y('Tr) > mi(7r) and since K-y < Ki, 
the two equations above imply 

M mm[l~M~y+M-l} 

K.y Y Kz^.>Ki J2 Kz^^> ' ' 



or _(_ 1 

2=max(-y+Af,2-A4') z=-y-M+l ~^ 



therefore 

M M 

2L + 1' 



z=msLx{-y+M,2~M) z=ma.x{~y+M,2-M) 

as required, 
(b) We define 

-y-M A/-1 L 

B= KzTTz + b Y K,n,+ Y KzTTz, 

z=-L z=-y-M+l z=M 

then the fitness of sites —y and (the cooperation intensity between these 
sites is b) are: 



zT^z , 



I M-1 

(13) m.yin) = K^y[B + {l-b) Y 

V z=-y+M J 

(14) mo(7r) = Ko('s + (l-6) Y • 

If m-y{7r) < m_y+A/('?i'), then 7r_j^+j\/ > 7r_.y by Lemma 2.4(c), and we are 
done. Otherwise, 

?n_.y(7r) > m_y+A/(vr) > K^y+uK-yl^-y > K_y+MK-yD. 

The above inequality and (13) imply that either 
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or 

K-y{l-b) K,TT,> 

z=-y+M 

or both. If K_y{l - b) T.l=l+M Kzrrz > K_y+MK^yD/2, then 



K-y+AlD 



y+M 
M-1 

< K^TI:, < K_y+M 



"^^^ ^) z=-y+M z=-y+M 



M-1 ^ 



since M — 1 > — y + M > 0, hence 

Af-l 

(15) y 7r^>— 

2 1-6 

z=~y+M ^ ' 

And if K_yB > K_.y+pjK_yD/2, then 

(16) B > 



K-y+uD 



Since ttq < D < TT-y, we have 7n_.y(7r) > mo('7r), then (13) and (14) imply the 
following: 



m.yjn) _ K^y{B + {l-b)Et~-y+MKzTTz) 
moiTT) Ko{B + {l-h) E;il,„M4-i KzT:,) ' 

1 ^ i? + (i-fc)Efi"4+M-^^^^ i-h 

K„„ - B B 



1 < 



1 + ^ E 



.^i^M^ (1-^)^-. - 2(1-6)K_, 
by (16). Since max^^[_y^M,M-^ = K_y+M, we obtain 

^ ^ ^,/^-2(l-6)A'„,- 
Inequalities (15) and (17) imply the desired result. □ 

Proof of Theorem 2.6. We prove this for the case of x G [— L,-l]. 
The other case of x G [1,-^^] is similar. Let —y be the right most (i.e., the 
largest) site in [— L, —1] where ir^y > 1/(2L + 1). We distinguish two cases, 
-y + M <1 and -y + M > 1. 

If —y + M < 1, then Lemma 3.3(a) implies that at least one site y' € 
[2 - M, M] has more mass than Ki/{2M - 1)(2L + 1). If y' G [1, M] then we 
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are done; y' cannot be by assumption; and if y' G [2 — Af, —1], then we use 
the next paragraph to estabUsh the result. 

Now we define D = Ki/{2M - 1)(2L + 1) and deal with the case of 
TT-y > D for some —y + M >1. This includes the case n^y > 1/(2L + 1) 
where —y + Af > 1. We only need to apply Lemma 3.3(b) to reach the de- 
sired conclusion. □ 



3.2.1. Relatively large fj,. For /i suitably large compared to b, we establish 
Proposition 2.8, which says that most of the mass is forced into 2 intervals 
on both sides of site 0, with little mass everywhere else. Theorem 2.6 im- 
plies that there must be significant mass in both these intervals, therefore 
speciation is likely to occur. For this, we first establish a lemma that gives 
a lower bound on the mean fitness m,r in terms of the mutation parameter 
A- 

Lemma 3.4. Let p= \M/2\. Suppose Assumption 1 holds and n is a 
stationary point of (9). // /i < 4Kp/{4L + 2)^, then > {jlKp/ A)"^/^ . 

Proof. If -n" is a stationary point of (9), then since rux > for any x, 
(12) implies 

i-(2L + l)<'^^ 



hence 



(2m^//i) + 2L + 1 
Since 2p > M, we have Bp__(^_p~^ = 1, therefore 



"i?!- ^ T^-pKpTTp > ( — ^ _ , r 1 > min 

and 



Kp Y^- (i^Kp Kp 



nT-TT > min 



(2m^//i) + 2L + iy - V4m^-'4L + 2 
fiiKp\^/'^ f Kp 



V 4 y 'V4L + 2 



If /i < 4:Kp/{AL + 2)^, then the above estimate reduces to the desired con- 
clusion. □ 



Proof of Proposition 2.8. We estimate the fitness of sites near 0. 
For X G [-/,/], 

rrixin) = bK^ ^ K^ttz + (1 - b)Kx ^ K^fcz 
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(18) <bKoYKo7^, + KoKn 

z z(^[x-M+l, x+M-1] 

(19) <h + Kn, 

where we use the fact that Kx is decreasing in [0, L] in the second line. And 
for X ^ [—n + l,n — 1], 

(20) mx{Tr) < Kx < Kn. 

Define ci = (flKp/A)'^/^ — b — Kn- Condition (10) implies that ci is positive, 
and (19), (20) and Lemma 3.4 applied to (12) imply that for x G [—L, —n] U 
[-l,l]U[n,L], 

{2L + 1)] =mif -mx{TT)>ci. 



2 \7r.^ 

Therefore for x G [—L, —n] U [— /, /] U [n, L], 



(2ci//i) + 2L + 1 - 2((/ii^p/4)2/3 -b-Kn) 
and the proof is complete. □ 

3.2.2. Small fx. Finally, we turn to the case of small fx, where we only 
consider local maxima of m,r and then we can use the perturbation result 
Propositions 2.2 and 2.3 to say something about 14- • 

We first establish Theorem 2.9, which says that if there is very little mass 
outside the interval (— [M/2J, [Af/2j), then the mass inside the interval 
(-[M/2J, [M/2\) is concentrated at site 0. 

Proof of Theorem 2.9. Since 2g - 1 < M, By^^ = 6 for y,z G {-q,q), 
therefore for x G (— g, 0) U (0, q), 

q-l 

mo{-K) = KoYB-zKzTTz >b Y ^^-Kz, 

z z=-q+l 



mxiTT 



I X+M-1 x-M L 

= Kxib Y KzTTz+ Y KzTTz+ Y ^^^^ 
\ z=x-M+l z=-L z=x+M 

/ Q-i -Q L \ 

<Kjb Y KzT^z+ Y KzlTz + Y^zTTz] 
\ z=-q+l z=-L z=q ) 

<Kjye^b Y 
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and 

q-l 

mo{-K) - m^{-K) > b{l - Ki) ^ K^vr^-Kie 

(21) >b{l-Ki)K,^i{l-e)-Kie 

= c>0 

a e <b{l - Ki)Kq_i/{b{l - Ki)Kg^i + Ki). Let J = (-g, 0) U (0, g), then 
Lemma 2.4(b) says that 

+ ^r^ ^ = "io(vr) + — , 



therefore 



ExGJ^x '^Hx&J^x 271-0 ' 



Exgj^x /iV27ro Y.x&j^x 

1 2/ 2c 

> 1 — ?Tio(vr) — maxm^.^vr) > — 

TTq /i V a;gj / /X 

by (21), which implies the desired conclusion. □ 

Now we focus on the local maxima that has their support spread M sites 
apart, which seem to be the only local maxima from simulation. Suppose a 
subset I of E = [— L, L] n Z has the properties that all x € / are at least M 
apart. Define 

A-^ = {vr G A : vr^ = for X ^ /}. 

We observe that A^ is a closed subset of A. We first establish Proposition 3.6 
below that states a condition necessary for a local maximum in A^ to be a 
local maximum in A, then establish Proposition 2.10, which has some results 
regarding the kinds of local maxima that various fitness functions can have. 



Lemma 3.5. Let k G R+ and 

kA^ = < (7r_i, . . . , ttl) : tt^,. > Vx G /, vr^,. = Vx ^ / and ^ tt^ = k> , 
[ x=-L J 

then TT G A^ is the unique local maximum of rn-n for vr lying in a small 
neighborhood in A^ if and only if kir G kA^ is the unique local maximum of 
for TT lying in a small neighborhood in kA^ . 
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Proof. If vf € is the unique local maximum of for vr lying in a 
small neighborhood in A^, then the Hessian matrix of 771^^ a-t vr is positive 
definite, and Lemma 3.1 implies that mx{Tf) are all equal for x I. Thus 
mxikn) = KxJ2z&i -^x-zKzkTTz — knixiTr) are all equal for x S / as well. 
This shows kn is a local extremum of for vr G kA^ . To verify it is a local 
maximum, we define /' = I\{p} where p is an arbitrary member of /, rewrite 
in terms of x G /', and calculate its first and second derivatives: 



dn,,, 



KxT^xBx-zK^TTz + 2Kp ik-J2'^x]Yl Bp~zKz1Tz 
x,z&I' \ x&I' ) zGl' 

+ BoK^Jk-J2^x] , 

\ xel' I 
'^Kw X] Bw-zKzTTz - 2Kp X Bp^zKzTTz 

z£r zer 
+ 2Kplk-Y,T^x]Bp-^K^-2BoK^[k-J2^x], 

\ x&I' ) \ xGl' / 



2Ky^By^^yKy 2KpBp^yKy 2KpBp^yjK^ -\- 2BqKp , 



where w,y (z I' . We observe that the second derivatives do not depend on k, 
therefore the Hessian matrix of fn.,^ is also positive definite at kn, and kn is 
the unique local maximum lying in a small neighborhood in kA^ . The proof 
of the reverse direction is similar. □ 



Proposition 3.6. Suppose if G A^ is the unique local maximum ofm.,^ 
for vr lying in a small neighborhood of tx in A^ , itx > for all x G /, and 
iTT'xi^) = for all X ^ I . 

(a) // mx{7r) < m2 < mi for all x ^ I, then vr is also the unique local 
maximum ofrnj^ for vr lying in a sufficiently small neighborhood in A. 

(b) If there exists y G E\I where my['k) = > mi, then vr is not a local 
maximum of for tt G A . 



Remark 3.7. If the set / consists of a singleton y, then 5y G A^ is 
trivially the unique local maximum of for vr lying in a small neighborhood 
of vr in A^, which is an empty set. In this case, to verify that 5y is also the 
unique local maximum of TnT^ for vr lying in a sufficiently small neighborhood 
in A, we only need to check that mx{Sy) < my{6y) for all x y. 
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Proof of Proposition 3.6. (a) We examine the fitness of tt in a small 
neighborhood 



A = S vr € A : max \tTx — t^xI < ^ 
xeE 



of vf in A: 



W-TT = KxT^xBx-zK^T^z + X! KxT^xBx-zKz 



(22) 



+ Kx'KxBx-zKz'Kz 

x<^i,zeE 



< Y KxT^xBx-zKzTTz + 2 Y Kx-KxBx-zKz 
xeI,zGl x0,z&E 

= Y Kx-KxBx-zKzT^z + 2 X! ^x^xItt). 
xG/,ze/ x^i 

Let c = J2x0 '^x- By Lemma 3.5, the configuration vri G (1 — c) A^ that maxi- 
mizes (locahy) the first term on the right-hand side of (22) is (1 — c)tt, which 
means that it satisfies 



Y KxTTxBx-zKzT^z < (1 

xei,z£i 



c)^mi 



if e in the definition of A is sufficiently small. If e is sufficiently small, then 
because mx{TT) is a continuous function of vr for all x, we have 



mx{TT) < m2 + 



mi — 1712 



for X ^ I. Applying the two estimates above to (22), we obtain 



"^7r < (1 — c) nil + 2c m2 + 



mi — m2 



mi — c{mi — m2) + (?mi < mi 



if c < (2L -|- l)e is strictly positive but sufficiently small. Hence vf is also the 
unique local maximum of fn.,^ for vr lying in a small neighborhood in A. 
(b) Let w G I, then vr^ > and m^(7r) = mi. Along the line 7r+ p{5y — 6^), 



dm. 



dp 



d 



7r=n,p=0 



d{TTy+p) 



m 



TT=TT,p=0 



''''x+p{Sy,x—Sm,x)' 



dp 



+ 



a 



d{TTw — P) n=i,p=0 

2{my{Tf) - m^(7f)) 
2(7713 - nil), 



'^T^x +P{&y,x-5w,x)' 



dj iTw - p) 
dp 
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which is strictly positive. Therefore vr is not a local maximum of m.„ for 

TT G A. □ 

Now we use the above result to establish Proposition 2.10, which explicitly 
computes some local maxima, when the dimension of A''^ is low enough (less 
than 3) to enable us to do hands-on computation. 

Proof of Proposition 2.10. (a) By Lemma 3.1, at any local maxi- 
mum vr, the support of vr must have equal fitness. We will show that either 
Xj -|- 1 or Xj^i — 1 is more fit than sites in {xi, . . . ,Xk}- If xj^i > 1, then since 
K is unimodal by assumption, K^^j^^-i > -ft^xj+i- Otherwise, xj+i < and 
Xj < -1, therefore Kr,.+i > Kr,.. So either i^x^+i-i > -f^x^+i or Kr,.+i > K^- 
or both. If Kx +i > Kx (the case of iT^: +i-i > -f^x +i is similar), then 



\z&{xi,...,Xj} ze{Xj + l,...,Xfc} 



Since Xj+i — {xj + 1) > A/, all B^^+i-z in the second sum above are 1 
and equal to B^^-z for z G {xj+i, . . . , j;fc} and B.^^+i-z > B^^-z for z G 
{xi, . . . , Xj} in the first sum above, therefore 



Xj 



.i{Ti)>Kx^+A J2 Bx^-zKznz+ Bx^-zKzffzj 

\ze{xi,...,Xj} ze{Xj+l,...,Xk} ' 

>Kx\ Y Bx^-zKznz+ Y Bx^-zKzTTzj 
\ze{xi,...,Xj} ze{xj+i,...,Xk} ' 

= mx,(^-), 

therefore vr cannot be a local maximum of by Proposition 3.6(b). 

(b) The fact that (5o is a stationary point of (9) is obvious; in fact, it 
holds for any K and h. But we need to check it is a local maximum of 
m,r if Km < b. We compute the fitness of all sites in E when tt = d^. For 
xG [-M-M,M- 1], 

bKx, if xG [-M + 1,M-1], 
Kx, ifxG [-L,-M]U[M,L]. 



mx{-K) 



If Km < b, then m^. < mo for all x ^ since K is increasing in [— L,0] and 
decreasing in [0,-L]. Proposition 3.6(a) and Remark 3.7 imply that 5q is a 
local maximum of m-,^. 

Now we deal with the case of Km > b. Since mjv/(5o) = Km > b = mQ{do), 
Proposition 3.6(b) and Remark 3.7 imply that 5q is not a local maximum of 
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(c) For vr = p5^x + i^— p)S~x+M where p is defined in the statement, brute 
force calculation shows that 



and 



Qp2 

which is < if bK^.^. + bK'^^j^^ < 2K_xK_x-\-m- This verifies that vr is the 
unique local maximum of m,r for vr lying in a small neighborhood of vr 
in 

It remains to check that all sites other than —x and —x + M are less fit, 
which calculations of fitness at these sites show to be true if b, K^x-Mi and 
K_x+2M are aU < K_xK-x+m{'^ + b)/{K_x + K_x+m)- 

(d) This result can be proved by brute force calculation, just like part (c). 
We omit the details. □ 
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